Abstract

We apply matrix theory over $\mathbb{F}_2$ to understand the nature of
so-called “successful pressing sequences” of black-and-white vertex-colored
graphs. These sequences arise in computational phylogenetics, where,
by a celebrated result of Hannenhalli and Pevzner, the space of
sortings-by-reversal of a signed permutation can be described by pressing
sequences. In particular, we offer several alternative linear-algebraic and
graph-theoretic characterizations of successful pressing sequences,
describe the relation between such sequences, and provide bounds on the
number of them. We also offer several open problems that arose as a
result of the present work.

1 Introduction

In a now classical paper in bioinformatics [5], Hannenhalli and Pevzner
showed that there is a polynomial time algorithm to sort signed permutations
by reversals, i.e., turn any signed permutation into the identity by
reversing subwords (and flipping their signs). This has important implications for
computational phylogenetics: when comparing the sequence of genes of two
related species, the shortest length of a sequence of reversals that transforms 
one into the other is one prominent measure of the evolutionary distance of 
the associated organisms. The authors’ strategy, and one that was improved 
upon in later work (for example, [8]), is to construct the so-called “breakpoint 
graph” for the permutation to be sorted, show that a certain operation on the 
breakpoint graph corresponds to reversals, and then use certain numerical 
invariants of subgraphs to guide the sequence of moves to the identity. 

This framework is now a keystone of bioinformatics algorithms, but it
leaves many questions unanswered. In particular, the proposed
methodologies generate just one successful sorting of the signed permutation under
consideration, and it is understood that there are often many such
minimum-length sorting sequences. Since each is only representative of one possible
evolutionary history, it would be valuable to be able to sample from all
possible such sequences to obtain more sensitive statistical properties. As yet,
there is no full understanding of the space of possible histories, so Markov
Chain Monte Carlo methods are valuable for approximately uniform
sampling. Such approaches present their own problems, however: it is necessary
to obtain a proof of connectivity of the underlying graph of the Markov Chain 
to know that it will eventually reach every vertex; and it is necessary to obtain 
bounds on the mixing time of the process to ensure that near-uniformity will 
be achieved in reasonable time. Indeed, some researchers have investigated 
these very kinds of questions: see, for example, [S]. 

In order to state our results and situate them in the above discussion, we
need the following definitions.

Definition 1. A bicolored graph is a pair $(G, c)$ where $G$ is a simple graph,
and $c : V(G) \to \{\text{black}, \text{white}\}$ is a coloring of its vertices.
Write $\overline{\text{black}} = \text{white}$ and
$\overline{\text{white}} = \text{black}$.

Denote by $V(G)$ the vertex set of a graph, $E(G)$ its edge set, and $G[S]$
the induced subgraph of a set $S \subseteq V(G)$; let $N(v) = N_G(v)$ denote the
neighborhood of $v \in V(G)$, i.e., $\{w \in V(G) : \{v, w\} \in E(G)\}$, and
$N^*(v) = N^*_G(v)$ the closed neighborhood of $v$, i.e.,
$N^*_G(v) = N_G(v) \cup \{v\}$.

Definition 2. Consider a bicolored graph $(G, c)$ with a black vertex
$v \in V(G)$. “Pressing $v$” is the operation of transforming $(G, c)$ into
$(G', c')$, a new bicolored graph in which $G[N^*(v)]$ is complemented. That is,
$V(G') = V(G)$,

$$E(G') = E(G) \,\triangle\, \binom{N^*(v)}{2}$$

(where “$\triangle$” denotes symmetric difference) and $c'(w) = c(w)$ for
$w \notin N^*(v)$ and $c'(w) = \overline{c(w)}$ for $w \in N^*(v)$.



Figure 1: The vertex enclosed by a dotted circle is pressed in graph (a) to 
obtain graph (b). 
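
As a concrete illustration of Definition 2, the following minimal Python sketch performs a single press. The data representation (a dict `adj` mapping each vertex to its set of neighbors, and a dict `color` with `True` for black) and the function name `press` are assumptions made for this example and do not come from the text.

```python
from itertools import combinations

def press(adj, color, v):
    """Return the bicolored graph obtained by pressing the black vertex v.

    adj   : dict vertex -> set of neighbors (simple graph, symmetric)
    color : dict vertex -> bool (True = black, False = white)
    """
    if not color[v]:
        raise ValueError("only black vertices may be pressed")
    closed = adj[v] | {v}                      # closed neighborhood N*(v)
    new_adj = {u: set(nbrs) for u, nbrs in adj.items()}
    for a, b in combinations(sorted(closed), 2):
        # Symmetric difference with all pairs inside N*(v):
        # existing edges are removed, missing edges are added.
        if b in new_adj[a]:
            new_adj[a].discard(b)
            new_adj[b].discard(a)
        else:
            new_adj[a].add(b)
            new_adj[b].add(a)
    # Colors flip inside N*(v) and are unchanged elsewhere.
    new_color = {u: (not c) if u in closed else c for u, c in color.items()}
    return new_adj, new_color

# Example: the path 1 - 2 - 3 with vertex 2 black and vertices 1, 3 white.
adj = {1: {2}, 2: {1, 3}, 3: {2}}
color = {1: False, 2: True, 3: False}
adj2, color2 = press(adj, color, 2)    # 2 becomes isolated; 1 and 3 become adjacent and black
```

In this example a second press, at vertex 1, leaves an all-white empty graph.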


The “pressing game” (to use terminology from [2]) is played by pressing 
black vertices of G iteratively with the ultimate goal of transforming G into 
an all-white, empty graph. Hannenhalli and Pevzner showed ([5]) that
“successful” sequences of presses in the breakpoint graph of a signed permutation,
i.e., sequences that result in an all-white empty graph, correspond bijectively
to minimum-length sequences of reversals that turn the permutation into the
identity. Therefore, sampling from successful pressing sequences is
equivalent to sampling from the minimum length sequences of reversals that sort a
signed permutation. In [2], the authors make the following “Pressing Game 
Conjecture”: 

Conjecture 1. Every successful pressing sequence can be reached from
every other one by a sequence of edits that involve at most four deletions or
insertions.

If successful pressing sequences are taken to be the vertices of a graph
n(G), and the edges correspond to edits of at most four deletions or
insertions, then the Pressing Game Conjecture implies that n(G) is connected.
Then a simple random walk converges to a uniform distribution on the set of
all successful pressing sequences, and Markov Chain Monte Carlo can be used
to analyze typical pressing sequences. Bixby, Flint, and Miklos [2] proved the 
conjecture for paths. Despite this, the current authors have doubts about 
the statement for general graphs. 

In the present manuscript, we explore a few aspects of matrix theory
over $\mathbb{F}_2$ so as to better understand the successful pressing sequences of a
graph. Among other results: in Corollary 4 we show that the rank of the
“augmented adjacency matrix” of a bicolored graph is the length of every
successful pressing sequence of a graph; Theorems 7 and 8 provide a
substantial collection of equivalent characterizations of successful pressing sequences;
Proposition 10 gives a matrix-theoretic formulation of the relationship
between successful pressing sequences; and Theorem 12 shows that the average
number of successful pressing sequences of a random (full-rank) bicolored
graph is large. The final section contains several open problems concerning
these sequences that arose in connection with the present work.

We note that some special cases of a few of our results are announced 
but left largely unproven in [B]; the authors refer to the matrix analogue of 
pressing as “clicking” and to the condition of the existence of a successful 
pressing sequence that consists of all vertices as “tightness.” 

2 Preliminaries 

The following result appears (less explicitly) in [5] and [1], but we include
the proof for completeness.¹

¹ Thanks to Eva Czabarka for suggesting this vastly simplified version of the
Hannenhalli-Pevzner argument.

Proposition 1. Any graph with a black vertex in every component has a
successful pressing sequence.

Proof. It suffices to prove the statement for connected graphs. Let $G$ be a
connected graph and $X$ be the set of black vertices with the fewest possible
black neighbors. Choose some $x \in X$ such that $\deg(x)$ is maximal in $X$.
When $x$ is pressed, we obtain $G'$. We claim that each component of $G'$ is
either a white isolated vertex or has at least one black vertex.

Let $N = N_G(x)$ be the set of neighbors of $x$ in $G$, let $P$ be the set of
vertices in $N$ that were white in $G$, and let $Q = N \setminus P$. Note that
$G$ and $G'$ are identical except on the induced subgraphs of $N \cup \{x\}$.
Every vertex in $V(G) \setminus (N \cup \{x\})$ is in a component with a vertex
of $N$ (in both $G$ and $G'$), so it suffices to show that each vertex of
$N \cup \{x\}$ is in a component of $G'$ with a black vertex or is an isolated
white vertex. Furthermore, in $G'$, $x$ is isolated and white and the vertices
of $P$ are black, so we need only consider the elements of $Q$.

Pick some $z \in Q$. If $z$ is adjacent to a black vertex outside of
$N \cup \{x\}$ or $z$ is not adjacent in $G$ to some vertex in $P$, then in
$G'$, $z$ is adjacent to a black vertex. Otherwise, in $G$, $z$ is adjacent to
all vertices of $P$ and its black neighbors are a subset of
$(Q \cup \{x\}) \setminus \{z\}$. By the choice of $x$, this implies that
the closed neighborhoods of $x$ and $z$ are the same in $G$, which implies that
$z$ is a white isolated vertex in $G'$. □
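
The selection rule used in this proof (a black vertex with the fewest black neighbors and, among those, of maximal degree) can be turned into a greedy procedure. The sketch below is one possible reading of that rule, not an algorithm stated in the text; the dict-based graph representation and the names `press` and `greedy_pressing_sequence` are assumptions for illustration. It assumes the hypothesis of Proposition 1 holds for every component that is not an isolated white vertex.

```python
def press(adj, color, v):
    """Press vertex v: complement the subgraph induced on N*(v), flip its colors."""
    closed = adj[v] | {v}
    new_adj = {}
    for u in adj:
        if u in closed:
            outside = adj[u] - closed               # edges leaving N*(v) are kept
            inside = closed - adj[u] - {u}          # edges inside N*(v) are complemented
            new_adj[u] = outside | inside
        else:
            new_adj[u] = set(adj[u])
    new_color = {u: (not c) if u in closed else c for u, c in color.items()}
    return new_adj, new_color

def greedy_pressing_sequence(adj, color):
    """Press vertices chosen as in the proof until no black vertex remains."""
    sequence = []
    while any(color.values()):
        black = [v for v in adj if color[v]]
        black_nbrs = {v: sum(color[w] for w in adj[v]) for v in black}
        fewest = min(black_nbrs.values())
        X = [v for v in black if black_nbrs[v] == fewest]
        x = max(X, key=lambda v: len(adj[v]))       # maximal degree within X
        adj, color = press(adj, color, x)
        sequence.append(x)
    return sequence, adj, color

# Path 1 - 2 - 3 - 4 with only vertex 1 black.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
colors = {1: True, 2: False, 3: False, 4: False}
sequence, final_adj, final_color = greedy_pressing_sequence(path, colors)
```

On this path the procedure presses the vertices from one end to the other and ends in the all-white empty graph.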

Definition 3. The augmented adjacency matrix $A(G) \in \mathbb{F}_2^{n \times n}$ of a
bicolored graph $G$ on $n$ vertices is the adjacency matrix of $G$ in which the entries
along the main diagonal, which correspond to the vertices of $G$, are given by the
color of the vertex: 0 if white or 1 if black.
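
For readers who prefer a concrete object, here is a small sketch that builds the augmented adjacency matrix as a list of 0/1 rows. The function name `augmented_adjacency` and the convention of labelling vertices 1 through n are assumptions of this example.

```python
def augmented_adjacency(n, edges, black):
    """A(G) over F_2 for a bicolored graph on vertices 1..n.

    edges : iterable of pairs (u, v) with u != v
    black : set of black vertices (all others are white)
    """
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u - 1][v - 1] = A[v - 1][u - 1] = 1   # ordinary adjacency entries
    for v in black:
        A[v - 1][v - 1] = 1                     # diagonal records the color
    return A

# Path 1 - 2 - 3 with only the middle vertex black.
A = augmented_adjacency(3, [(1, 2), (2, 3)], black={2})
```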

Given a bicolored graph $G$, we can define a (loopy simple, uncolored)
graph $\mathcal{G}$ to be the graph on the same vertex set with the same edges, but
with a loop at each black vertex (and none at white vertices). A perfect
matching in such a graph is a set of edges incident to every vertex exactly
once, where a loop is considered to be incident to its vertex only once. A
special case of the following result (that of zero diagonal) appears in [1].

Proposition 2. The number of perfect matchings in the loopy graph $\mathcal{G}$
corresponding to a bicolored graph $(G, c)$ is odd if and only if $A(G)$ is invertible
over $\mathbb{F}_2$.

Proof. It is well known that the permanent (which is equal to the determinant
in characteristic 2) of $A(G)$ is equal to

$$\sum_{C} 2^{Z(C)}$$

where $C$ ranges over all vertex circuit covers, i.e., families of circuits (closed
walks) in which each vertex appears exactly once and $Z(C)$ is the number of
such circuits of length greater than two. (See, for example, [3].) Therefore,
over $\mathbb{F}_2$, the only terms which make a contribution to $\det(A(G))$ are those in
which there are no circuits of length more than two, i.e., every component is
a loop or a single edge - precisely the condition of being a perfect matching.
Since $\det(A(G)) = 1$ if and only if $A(G)$ is invertible, this is equivalent to
there being an odd number of perfect matchings. □
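
Proposition 2 can be checked exhaustively for small matrices. The sketch below counts perfect matchings of the loopy graph by brute force and compares the parity with the determinant over $\mathbb{F}_2$; the helper names `count_perfect_matchings`, `det_mod2`, and `all_symmetric` are hypothetical and chosen only for this illustration.

```python
from itertools import product

def count_perfect_matchings(A):
    """Count perfect matchings of the loopy graph with symmetric 0/1 matrix A.
    A loop (diagonal 1) covers one vertex; an ordinary edge covers two."""
    def count(remaining):
        if not remaining:
            return 1
        v, rest = remaining[0], remaining[1:]
        total = 0
        if A[v][v]:
            total += count(rest)                     # cover v by its loop
        for w in rest:
            if A[v][w]:                              # cover v by the edge {v, w}
                total += count(tuple(u for u in rest if u != w))
        return total
    return count(tuple(range(len(A))))

def det_mod2(A):
    """Determinant over F_2 by Gaussian elimination (row swaps are harmless mod 2)."""
    M = [row[:] for row in A]
    n = len(M)
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col]), None)
        if pivot is None:
            return 0
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            if M[r][col]:
                M[r] = [a ^ b for a, b in zip(M[r], M[col])]
    return 1

def all_symmetric(n):
    """Yield every symmetric n x n matrix over F_2."""
    pairs = [(i, j) for i in range(n) for j in range(i, n)]
    for bits in product((0, 1), repeat=len(pairs)):
        A = [[0] * n for _ in range(n)]
        for (i, j), b in zip(pairs, bits):
            A[i][j] = A[j][i] = b
        yield A

agree = all(count_perfect_matchings(A) % 2 == det_mod2(A) for A in all_symmetric(3))
```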


3 Matrix Theory 


Define the function $f(M)$ on $n \times n$ nonzero matrices over $\mathbb{F}_2$ as follows; the
action of $f$ will amount to a slight modification of Gaussian elimination,
wherein row-permutations are prohibited and the pivot row is added to itself
as well. Let $s$ denote the smallest row index of a left-most 1 in $M$, that is,
the positive integer for which there exists a $t$ so that

1. $M_{s,t} = 1$

2. $M_{s,j} = 0$ if $j < t$

3. If $i < s$ and $j \le t$, then $M_{i,j} = 0$.

Note that $s$ and $t$ are uniquely determined by $M$ in satisfying the above
requirements. Let $U = U(M)$ be the set of row indices which have a 1 in
column $t$, i.e.,

$$U = \{i : M_{i,t} = 1\}.$$

Let $f(M)$ denote the $n \times n$ matrix so that

$$f(M)_{i,j} = \begin{cases} M_{i,j} + M_{s,j} & \text{if } i \in U, \\ M_{i,j} & \text{otherwise.} \end{cases}$$

Note that, for every matrix $M$, there is a sequence of $s$'s and $t$'s that arise
from the iterative application of $f$ to $M$. That is, given $M$, there is an
increasing sequence $s_1, s_2, \ldots, s_p$ and increasing sequence $t_1, \ldots, t_p$ which serve
as the indices in the above definition of $f(M)$, $f(f(M))$, etc. Indeed, it is
easy to see that the sequence must eventually result in the all-zeroes matrix,
so this process terminates at some finite $p = p(M)$.

If, for each $r \in [p]$, $s_r = r$, we call $M$ “leading principally nonsingular
(LPN)”. If $M$ is LPN, then the sequence $M, f(M), f(f(M)), \ldots, f^{(p)}(M)$,
$p = \operatorname{rank}(M)$, is precisely the sequence of matrices one obtains by performing
Gaussian elimination on $M$, with the additional operation of adding pivot
rows to themselves (thus replacing them with the zero vector). Furthermore,
this elimination does not involve row permutations. Therefore, $M$ is
row-reducible to a diagonal matrix whose leading principal submatrix of size $p$ is
the identity matrix $I_p$, and whose other entries are zero.
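
The map $f$ and the LPN test are easy to experiment with. The following sketch uses 0-based indices and reads the pivot as follows: $t$ is the left-most column containing a 1, and $s$ is the smallest row index with a 1 in that column. That reading, and the function names `pivot`, `f`, `pivot_sequence`, and `is_lpn`, are assumptions of this example rather than definitions taken verbatim from the text.

```python
def pivot(M):
    """Return 0-based (s, t), or None if M is the zero matrix.
    t: left-most column containing a 1; s: smallest row with a 1 in column t."""
    n = len(M)
    for t in range(n):
        for s in range(n):
            if M[s][t]:
                return s, t
    return None

def f(M):
    """Add the pivot row to every row with a 1 in the pivot column (itself included).
    Only meaningful for nonzero M, as in the text."""
    s, t = pivot(M)
    pivot_row = M[s][:]
    return [[a ^ b for a, b in zip(row, pivot_row)] if row[t] else row[:]
            for row in M]

def pivot_sequence(M):
    """List the pivots (s_1, t_1), (s_2, t_2), ... met while iterating f."""
    seq = []
    while pivot(M) is not None:
        seq.append(pivot(M))
        M = f(M)                 # clears the pivot column, so the loop terminates
    return seq

def is_lpn(M):
    """LPN check in 0-based form: the r-th pivot row is row r."""
    return all(s == r for r, (s, _) in enumerate(pivot_sequence(M)))

# Augmented adjacency matrix of the path 1 - 2 - 3 with vertices 1 and 3 black.
M = [[1, 1, 0],
     [1, 0, 1],
     [0, 1, 1]]
pivots = pivot_sequence(M)       # two pivots, matching rank(M) = 2
```

Here the two pivots sit in rows 0 and 1, so `M` is LPN in the 0-based sense, whereas the matrix `[[0, 1], [1, 0]]` is not.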

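Note that, if $M$ is symmetric and its pivot lies on the diagonal (that is,
$t = s$, as is the case when $M$ is LPN), then $f(M)$ is symmetric as well. Indeed,

$$\begin{aligned}
f(M)_{i,j} &= M_{i,j} + M_{s,j} \cdot M_{i,s} && \text{by the definition of } f \\
&= M_{j,i} + M_{j,s} \cdot M_{s,i} && \text{by the symmetry of } M \\
&= M_{j,i} + M_{s,i} \cdot M_{j,s} \\
&= f(M)_{j,i} && \text{by the definition of } f.
\end{aligned}$$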


Therefore, suppose that the $M$ above is $A(G)$ and is LPN. Then it is
straightforward to see that $f(M)$ is in fact $A(G')$ where $G'$ is obtained from $G$ by
pressing its lowest-indexed (black) vertex. Since $A(G)$ being the all-zeroes
matrix is precisely the condition that $G$ has no edges and all vertices are
white, $M = A(G)$ being LPN is equivalent to $G$ having a successful pressing
sequence consisting of the vertices indexing the first $\operatorname{rank}(A(G))$ columns of
$M$ in increasing order.

We may conclude the following. 

Proposition 3. For a graph $G$ with vertex set $[n]$, the following are
equivalent:

1. Gaussian elimination applied to $A(G)$ consists of row reduction applied,
in consecutive order, to the first $\operatorname{rank}(A(G)) = k$ rows.

2. The first $k$ leading principal minors of $A(G)$ are nonzero, and the rest
are zero.

3. $1, 2, \ldots, k$ is a successful pressing sequence for $G$.

Corollary 4. The number of vertices in any successful pressing sequence for
a graph $G$ depends only on the graph, and is equal to $\operatorname{rank}(A(G))$.
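
Corollary 4 can be observed directly on a small example by enumerating every successful pressing sequence and comparing lengths with the rank over $\mathbb{F}_2$. In the sketch below, pressing is carried out on the augmented adjacency matrix (adding row $v$ to each row with a 1 in column $v$); the function names and the 0-based vertex labels are assumptions of this illustration.

```python
def press(A, v):
    """Press black vertex v (0-based) on the augmented adjacency matrix A."""
    row_v = A[v][:]
    return [[a ^ b for a, b in zip(row, row_v)] if row[v] else row[:] for row in A]

def successful_sequences(A):
    """All pressing sequences that end in the all-zero matrix."""
    if not any(any(row) for row in A):
        return [[]]
    result = []
    for v in range(len(A)):
        if A[v][v]:                                  # v is currently black
            for tail in successful_sequences(press(A, v)):
                result.append([v] + tail)
    return result

def rank_mod2(A):
    """Rank over F_2 by Gauss-Jordan elimination."""
    M = [row[:] for row in A]
    n, rank = len(M), 0
    for col in range(n):
        pivot = next((r for r in range(rank, n) if M[r][col]), None)
        if pivot is None:
            continue
        M[rank], M[pivot] = M[pivot], M[rank]
        for r in range(n):
            if r != rank and M[r][col]:
                M[r] = [a ^ b for a, b in zip(M[r], M[rank])]
        rank += 1
    return rank

# Path 0 - 1 - 2 with vertices 0 and 2 black.
A = [[1, 1, 0],
     [1, 0, 1],
     [0, 1, 1]]
sequences = successful_sequences(A)
lengths = {len(s) for s in sequences}                # a single length, equal to the rank
```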

The preceding result justifies the following definition.

Definition 4. The pressing number of a graph is the number of presses
required to transform the graph into an all-white, empty graph.

A matrix $M$ is said to have an LU-decomposition if there exist a lower
triangular matrix $L$ and an upper triangular matrix $U$ so that

$$M = LU.$$

Call a matrix $M$ “Cholesky” (to borrow terminology from the theory of
real/complex matrices; q.v. [7]) if there exists a lower-triangular $L$ so that
$M = LL^T$; such a product is evidently a special type of LU-decomposition.
The following lemma is folkloric.

Lemma 5. If $M = LU$ and $M$ is invertible, this decomposition is unique.

Proof. Suppose $M = LU = L'U'$. Then $L$, $U$, $L'$, and $U'$ are invertible, so

$$(L')^{-1} L = U' U^{-1}.$$

The left-hand side of this equation is lower-triangular and the right is
upper-triangular, so they must both be diagonal. However, since the only invertible
diagonal matrix over $\mathbb{F}_2$ is the identity, $L = L'$ and $U = U'$. □

Lemma 6. If a symmetric matrix $M$ over $\mathbb{F}_2$ has an LU-decomposition, then
it has a Cholesky decomposition $LL^T$.

Proof. We proceed by induction. The base case is trivial: $M = [0]$ or
$M = [1]$. Suppose $M$ is $n \times n$ and the statement is true for all $1 \le k < n$. If the first
row of $M$ is all zero, then Gaussian elimination of $M$ can proceed without
row permutations if and only if this is true of $M'$, the matrix $M$ with its
first row and column removed. Since the existence of an LU-decomposition is
equivalent to Gaussian elimination proceeding without the necessity of row
permutations, it follows that we may apply the inductive hypothesis to $M'$.
Suppose the first row of $M$ is nonzero. Then $M$ can be written

$$M = LU = \begin{bmatrix} L_0 & 0 \\ A & B \end{bmatrix} \begin{bmatrix} U_0 & C \\ 0 & D \end{bmatrix} = \begin{bmatrix} L_0 U_0 & L_0 C \\ A U_0 & AC + BD \end{bmatrix}$$

where $L_0$ and $U_0$ are invertible leading principal submatrices, $B$ is
lower-triangular and $D$ is upper-triangular - because, otherwise, the first row or
column of $M$ would be zero. Since $M_0 = L_0 U_0$ is nonsingular, this
decomposition of $M_0$ is unique by Lemma 5. Thus, $M_0 = M_0^T = U_0^T L_0^T$ implies that
$U_0 = L_0^T$. Since $M$ is symmetric,

$$L_0 C = (A U_0)^T = U_0^T A^T = L_0 A^T$$

whence $C = A^T$. Therefore, if $M_1 = AC + BD$ is the lower-right
$(n - k) \times (n - k)$ principal submatrix of $M$, we may rewrite

$$M_1 - AA^T = BD.$$

Since the left-hand side is symmetric, the right-hand is as well, and we may
apply induction: $M_1 - AA^T$ has an LU-decomposition of the form $BD$, so it
has an $L_1 L_1^T$-decomposition as well. Therefore, we may let

$$L = \begin{bmatrix} L_0 & 0 \\ A & L_1 \end{bmatrix}$$

and conclude that $M = LL^T$. □
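
In the simplest situation, where every pivot met along the way equals 1, a Cholesky factor over $\mathbb{F}_2$ can be computed by peeling off rank-one terms. The sketch below covers only that case and returns `None` as soon as a zero pivot appears; it is not the inductive construction of the proof, and the names `cholesky_f2`, `mat_mul_f2`, and `transpose` are assumptions of this example.

```python
def cholesky_f2(M):
    """Return lower-triangular L (rows of 0/1) with M = L L^T over F_2,
    or None if a zero pivot is met (a case this sketch does not handle)."""
    n = len(M)
    R = [row[:] for row in M]                   # running remainder, stays symmetric
    L = [[0] * n for _ in range(n)]
    for k in range(n):
        if R[k][k] == 0:
            return None
        col = [R[i][k] for i in range(n)]       # column k of the remainder
        for i in range(n):
            L[i][k] = col[i]
        # Subtracting col * col^T is the same as adding it in characteristic 2;
        # this clears row k and column k of the remainder.
        R = [[R[i][j] ^ (col[i] & col[j]) for j in range(n)] for i in range(n)]
    return L

def transpose(X):
    return [list(r) for r in zip(*X)]

def mat_mul_f2(X, Y):
    """Matrix product over F_2."""
    return [[sum(x & y for x, y in zip(row, col)) % 2 for col in zip(*Y)]
            for row in X]

M = [[1, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
L = cholesky_f2(M)
product_matches = mat_mul_f2(L, transpose(L)) == M
```

Each column of `L` is the pivot column of the current remainder, so the factor records the closed neighborhoods seen while pressing vertices in index order.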


Theorem 7. Given a bicolored labeled graph $G$ and integer $k$, the following
are equivalent:

1. The pressing number of $G$ is $k$.

2. $A(G)$ has rank $k$ and can be written

$$A(G) = P^T L L^T P$$

for some lower-triangular matrix $L$ and permutation matrix $P$.

3. $\operatorname{rank}(A(G)) = k$ and $G$ has a black vertex in each component that is
not an isolated vertex.

4. There is some permutation matrix $P$ so that the $j$-th leading principal
minor of $P^T A(G) P$ is nonzero for $j \in [k]$ and is zero for $j > k$ if
$k < n$.

5. There is an ordering of the vertices $v_1, \ldots, v_n$ of $G$ so that the induced
subgraph $\mathcal{G}[\{v_1, \ldots, v_j\}]$ has an odd number of perfect matchings for
each $j \in [k]$, and, for each $j \in [n] \setminus [k]$, $\mathcal{G}[\{v_1, \ldots, v_j\}]$ has an even
number of perfect matchings.

6. $A(G) = P^T L U P$ for some permutation matrix $P$, lower triangular
matrix $L$, and upper triangular matrix $U$, where $\operatorname{rank}(LU) = k$.

We note that there are yet other equivalent conditions in terms of rank 
of submatrices which are given by Johnson and Okunev in HD]. 
Proof. $1 \Leftrightarrow 6$: This follows from Proposition 3, since the existence of an
LU-decomposition is equivalent to a matrix being row-reducible without
performing row permutations; conjugation by $P$ has the effect of placing rows
and columns indexed by the pressing sequence in an initial position of the
matrix.

$2 \Leftrightarrow 6$: This is a consequence of Lemma 6.

$1 \Leftrightarrow 3$: The combination of Proposition 1, Proposition 3, and Corollary 4
gives this equivalence.

$1 \Leftrightarrow 4$: This follows from Proposition 3.

$4 \Leftrightarrow 5$: This is a consequence of Proposition 2.

All conditions are therefore equivalent by transitivity. □

We may consider the special case when the labeling provides a successful 
pressing sequence. 

Theorem 8. Given a bicolored labeled graph $G$ on $[n]$, the following are
equivalent:

1. The vertices of $G$, in the usual order, are a successful pressing sequence.

2. $A(G)$ can be written

$$A(G) = LL^T$$

for some invertible lower-triangular matrix $L$.

3. The $j$-th leading principal minor of $A(G)$ is nonzero for each $j \in [n]$.

4. The induced subgraph $\mathcal{G}[\{1, \ldots, j\}]$ has an odd number of perfect
matchings for each $j \in [n]$.

5. $A(G) = LU$ for some invertible lower triangular matrix $L$ and invertible
upper triangular matrix $U$.

Proof. All statements are simply specializations of those from the preceding
Theorem, except for 2. That $L$ (and not just $A(G)$) can be taken to be full
rank follows from the fact that $\operatorname{rank}(L) \ge \operatorname{rank}(LL^T)$, so $L$ is
invertible whenever $A(G) = LL^T$ is. □

4 Enumeration and Pressing Sequences

Proposition 9. Given a graph $G$ with vertex set $[n]$ and a permutation $\sigma$
of $[n]$, there is exactly one bicoloring of $G$ for which $\sigma$ is a valid pressing
sequence.

Proof. We apply the part of Theorem 7 that says that $\sigma$ is a valid pressing
sequence if and only if $\mathcal{G}[\sigma(1), \ldots, \sigma(k)]$ has an odd number of perfect
matchings for each $k \in [n]$. The proof proceeds by induction on $k$. If $k = 1$, clearly
the number of perfect matchings is odd if and only if $\mathcal{G}[\sigma(1)]$ has a loop,
so $\sigma(1)$ must be black in the bicoloring. Suppose the statement is true for
$k < n$. Let $a$ denote the number of perfect matchings of
$\mathcal{G}[\sigma(1), \ldots, \sigma(k)]$, $b$ the number of perfect matchings of
$\mathcal{G}[\sigma(1), \ldots, \sigma(k+1)]$, and $c$ the number of perfect matchings of
$\mathcal{G}[\sigma(1), \ldots, \sigma(k+1)]$ in which $\sigma(k+1)$ is covered by an edge to
one of its neighbors rather than by a loop. Then $b = c$ if
$\sigma(k+1)$ does not have a loop, while $b = a + c$ if $\sigma(k+1)$ does have a loop.
Since $a$ is odd by the inductive hypothesis, exactly one of $c$ or $a + c$ is odd,
so the color of $\sigma(k+1)$ is uniquely determined. □
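
The bicoloring promised by Proposition 9 can also be found by direct simulation, which gives a different route from the matching-parity argument above. The edge changes caused by a press do not depend on colors, so one can track how often each vertex has been flipped and force each vertex to be black at the moment it is pressed. The sketch below does this; the names `press_edges` and `forced_bicoloring` and the dict-of-sets representation are assumptions of this example.

```python
def press_edges(adj, v):
    """Edge update of a press at v: complement the subgraph induced on N*(v)."""
    closed = adj[v] | {v}
    return {u: ((adj[u] - closed) | (closed - adj[u] - {u})) if u in closed
               else set(adj[u])
            for u in adj}

def forced_bicoloring(adj, sigma):
    """Return dict vertex -> bool (True = black): the coloring for which
    sigma, a permutation of all vertices, is a successful pressing sequence."""
    flipped = {v: False for v in adj}     # flipped an odd number of times so far?
    color = {}
    for v in sigma:
        # v must be black when pressed, so its original color is determined
        # by the parity of the flips it has received.
        color[v] = not flipped[v]
        for u in adj[v] | {v}:
            flipped[u] = not flipped[u]
        adj = press_edges(adj, v)
    return color

# Path 1 - 2 - 3 pressed in the order 2, 1, 3.
path = {1: {2}, 2: {1, 3}, 3: {2}}
coloring = forced_bicoloring(path, [2, 1, 3])
```

Once pressed, a vertex is isolated and white and is never touched again, so after all $n$ presses the graph is empty and all-white.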

As with real/complex matrices, a matrix $Q$ over $\mathbb{F}_2$ is said to be
“orthogonal” if $Q^T Q = I$. The set of all $n \times n$ orthogonal matrices over $\mathbb{F}_2$ is the
“orthogonal group” and is denoted $O(n)$.

Proposition 10. Suppose the bicolored graph $G$ with vertex set $[n]$ has the
identity permutation as a pressing sequence, let $A$ be the augmented adjacency
matrix of $G$, and let $A = LL^T$ be the Cholesky decomposition guaranteed
by Theorem 8. Then $\sigma$ is also a pressing sequence of $G$ iff there exist an
orthogonal matrix $Q$ and an upper triangular matrix $U$ so that

$$L^T P^T = QU,$$

where $P$ is the permutation matrix encoding $\sigma$.

Proof. If $L^T P^T = QU$, then

$$PAP^T = PLL^TP^T = U^T Q^T Q U = U^T U,$$

a Cholesky decomposition for the matrix $PAP^T$. On the other hand, suppose
$\sigma$ is a pressing sequence for $G$. Then

$$PAP^T = U^T U$$

for some invertible upper triangular $U$, whence

$$PLL^TP^T = U^T U.$$

Let $Q = L^T P^T U^{-1}$; then $Q^T Q = I$, so $Q$ is orthogonal, and

$$L^T P^T = QU.$$

□

It is worth remarking that one may take P to be any permutation matrix 
representing an automorphism of G, Q = /, and U = to obtain a solution 
to P^ = QU. Indeed, acting on a bicolored graph by an automorphism 
fixes its pressing sequences. Therefore, by the above proposition, one may
view successful pressing sequences as a kind of $\mathbb{F}_2$-relaxation of
automorphisms.

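Given a matrix $B$, we define a new matrix $\psi(B)$ as follows. Let the
columns of $B$ be $b_1, \ldots, b_n$ and the columns of $\psi(B)$ be
$b'_1, \ldots, b'_n$. If $b'_1, \ldots, b'_k$ have been defined for $k < n$, we define

$$b'_{k+1} = b_{k+1} + \sum_{j=1}^{k} (b'_j \cdot b_{k+1})\, b'_j.$$

Note that we can also define $b'_{k+1}$ by

$$b'_{k+1} = b_{k+1} + \sum_{\substack{j \le k \\ b'_j \cdot b_{k+1} = 1}} b'_j.$$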

Proposition 11. Suppose the bicolored graph $G$ with vertex set $[n]$ has the
identity permutation as a pressing sequence, let $A$ be the augmented adjacency
matrix of $G$, and let $A = LL^T$ be the Cholesky decomposition guaranteed by
Theorem 8. Let $\sigma$ be a permutation of $[n]$ and let $P$ be the permutation matrix
encoding $\sigma$. Then $\sigma$ is a valid pressing sequence for $G$ iff the all-ones vector
$\mathbf{1}$ is a (left) eigenvector of $\psi(L^T P^T)$ iff $\psi(L^T P^T)$ is orthogonal.

Proof. Note that the computation of $\psi(B)$ is precisely that of performing
the Gram-Schmidt algorithm on the columns of $B$, except that no
“normalization” occurs, i.e., one does not divide by the norm of the resulting vectors.
However, over $\mathbb{F}_2$, there are only two possible “norms”: 0 and 1. Therefore,
if the norm of each of the $b'_j$ produced in the computation of $\psi(B)$ is 1 for
all $j$, the columns of $\psi(B)$ are the same as the output of Gram-Schmidt
orthonormalization, whence we obtain a factorization of $B$ of the form $QU$
with $Q$ orthogonal and $U$ upper triangular. (This is usually termed a
“QR-factorization”.) The only failure mode of this computation is if some $b'_j$
is self-orthogonal. Since self-orthogonality is equivalent to having an inner
product of 0 with $\mathbf{1}$, by Proposition 10, if $\sigma$ is a successful pressing sequence,
$\psi(L^T P^T)$ is orthogonal and $\mathbf{1}^T \psi(L^T P^T) = \mathbf{1}^T$; otherwise,
$\psi(L^T P^T)$ is not orthogonal and $\mathbf{1}^T \psi(L^T P^T) \ne \mathbf{1}^T$. □

Theorem 12. For an (ordinary) graph $G$ on $n$ vertices, let $u_G$ denote the
average number of length-$n$ successful pressing sequences over all bicolorings
of $G$. Then

$$u_G = \frac{n!}{2^n}.$$

Proof. We construct a bipartite graph $\Gamma$ as follows. One partition class $S$
consists of all permutations of $[n]$; the other partition class is $\mathcal{B}$, the set of
all bicolorings of $G$, which we assume has vertex set $[n]$. We place an edge
between a permutation $\sigma$ and a bicoloring iff $\sigma$ is a successful pressing
sequence for the resulting bicolored graph. By Proposition 9, the $\Gamma$-degree of
each vertex in $S$ is 1, so the number of edges in $\Gamma$ is $n!$. On the other hand,
the number of edges incident to a bicoloring is its number of length-$n$ successful
pressing sequences. Therefore, the sum of all degrees in $\mathcal{B}$ is $n!$. Since
$|\mathcal{B}| = 2^n$, we may conclude that

$$u_G = \frac{n!}{|\mathcal{B}|} = \frac{n!}{2^n}.$$

□

Note that, since a particular labeling of a graph admits precisely one
bicoloring for which the identity permutation is a length-$n$ successful pressing
sequence, the probability that a uniformly random symmetric matrix
(i.e., the augmented adjacency matrix of a bicolored graph) is invertible and
Cholesky is $2^{-n}$.
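
The double count in the proof of Theorem 12 is easy to confirm numerically for small graphs. The following brute-force sketch sums, over all $2^n$ bicolorings, the number of permutations that are successful pressing sequences; the function names and the 0-based vertex labels are assumptions of this example, and the approach is only practical for very small $n$.

```python
from itertools import permutations, product
from math import factorial

def press(A, v):
    """Press black vertex v (0-based) on the augmented adjacency matrix A."""
    row_v = A[v][:]
    return [[a ^ b for a, b in zip(row, row_v)] if row[v] else row[:] for row in A]

def is_full_pressing_sequence(A, sigma):
    """True if pressing the vertices in the order sigma is always legal
    and ends in the all-zero matrix."""
    for v in sigma:
        if not A[v][v]:                  # v is white when its turn comes
            return False
        A = press(A, v)
    return not any(any(row) for row in A)

def total_full_sequences(n, edges):
    """Sum over all bicolorings of the number of length-n successful sequences."""
    total = 0
    for diag in product((0, 1), repeat=n):
        A = [[0] * n for _ in range(n)]
        for u, v in edges:
            A[u][v] = A[v][u] = 1
        for v, d in enumerate(diag):
            A[v][v] = d
        total += sum(is_full_pressing_sequence(A, s)
                     for s in permutations(range(n)))
    return total

n, edges = 3, [(0, 1), (1, 2)]           # the path on three vertices
total = total_full_sequences(n, edges)
average = total / 2 ** n
```

For the path on three vertices the total is $3! = 6$, so the average is $6/8$.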


5 Conclusion 

We present a few open questions on the subject of pressing sequences in
addition to the Pressing Game Conjecture (discussed in the introduction).

Question 1. How hard is it in general to compute the number of successful
pressing sequences of a given bicolored graph?

By the remarks following Proposition 10, it is perhaps the case that this
enumeration problem is GI-complete, i.e., of the same difficulty as certifying
graph isomorphism and counting automorphisms. Alternatively, the
connection with counting perfect matchings suggests it might be #P-hard. Given
that we do not know the complexity of counting pressing sequences exactly,
perhaps the approximation problem is easier:

Question 2. Is there a polynomial time algorithm for approximating within
a small factor the number of successful pressing sequences of a given bicolored
graph?

In studying some of these questions, the authors found a substantial,
though small, number of nonisomorphic graphs which have exactly one
pressing sequence - graphs we term “uniquely pressable”. However, we lack a
characterization of these graphs.

Question 3. Describe the uniquely pressable graphs.